Become a Bayesian in 10 minutes

Probabilities

Which interpretation do you prefer?

If I assume a value of zero for the parameter, what is the probability of observing my estimate or something more extreme?

Or

What’s the probability my result is greater than zero?
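
As a rough illustration of the two questions, the sketch below uses only base R and the built-in mtcars data. It assumes a flat (noninformative) prior, under which the posterior for a regression slope is a scaled t distribution centred on the least-squares estimate. The prior choice, the seed, and the number of draws are assumptions made for this example. Because the slope of mpg on wt is negative, the direct question here becomes "what is the probability the slope is below zero?"

```r
# Least-squares fit; under a flat prior (an assumption of this sketch)
# its estimate and standard error also describe the posterior for the slope.
fit <- lm(mpg ~ wt, data = mtcars)
est <- coef(summary(fit))["wt", "Estimate"]
se  <- coef(summary(fit))["wt", "Std. Error"]
df  <- df.residual(fit)

# Question 1: probability of a result this extreme if the slope were zero
p_value <- coef(summary(fit))["wt", "Pr(>|t|)"]

# Question 2: probability the slope is below zero, given the data
set.seed(1)
draws <- est + se * rt(4000, df)   # posterior draws for the slope
p_below_zero <- mean(draws < 0)
```

The second number is a statement about the parameter itself, which is usually what people want the p-value to mean.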

Intervals

Which interpretation do you prefer?

If I repeat this study precisely an infinite number of times and calculate a 95% interval each time, then 95% of those intervals will contain the true parameter.

Or

What’s the probability the parameter falls in this interval?
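
The second reading can be checked directly in a simple case. The sketch below is a minimal example, assuming made-up data (7 successes in 20 trials) and a uniform Beta(1, 1) prior, so that the posterior is a Beta distribution available in closed form. The counts and the prior are illustrative assumptions.

```r
# Hypothetical data: 7 successes in 20 trials (made up for illustration)
successes <- 7
trials    <- 20

# A uniform Beta(1, 1) prior gives a Beta posterior in closed form
post_a <- 1 + successes
post_b <- 1 + trials - successes

# 95% equal-tailed credible interval for the success probability
ci <- qbeta(c(0.025, 0.975), post_a, post_b)

# Posterior probability that the parameter lies inside the interval
prob_inside <- pbeta(ci[2], post_a, post_b) - pbeta(ci[1], post_a, post_b)
```

Here prob_inside is 0.95 by construction: the interval is read as a probability statement about the parameter, given the data and the prior.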

Perks

Intuitive results

Auto-regularization

  • Guards against overfitting

Intervals for anything you can calculate
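
To make the last perk concrete, here is a minimal base R sketch that draws from the joint posterior of a simple regression and then builds an interval for a derived quantity: the weight at which expected mpg equals 20. It assumes a flat prior on the coefficients and log variance, for which the posterior can be sampled directly. The target value of 20, the seed, and the number of draws are arbitrary choices for illustration.

```r
fit     <- lm(mpg ~ wt, data = mtcars)
X       <- model.matrix(fit)
b_hat   <- coef(fit)
df      <- df.residual(fit)
s2      <- sum(residuals(fit)^2) / df
XtX_inv <- solve(crossprod(X))

set.seed(1)
n_draws <- 4000

# Residual variance draws: scaled inverse chi-square
sigma2 <- df * s2 / rchisq(n_draws, df)

# Coefficient draws: normal around b_hat with covariance sigma2 * (X'X)^-1
L <- t(chol(XtX_inv))                        # L %*% t(L) equals XtX_inv
z <- matrix(rnorm(n_draws * length(b_hat)), nrow = length(b_hat))
beta <- t(b_hat + sweep(L %*% z, 2, sqrt(sigma2), "*"))  # n_draws x 2

# Any function of the draws gets its own interval
wt_at_20 <- (20 - beta[, 1]) / beta[, 2]
ci_wt    <- quantile(wt_at_20, c(0.025, 0.975))
```

The same pattern applies to draws returned by a fitted Stan model: compute the quantity once per draw, then summarise.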

Distributions

The goal is a distribution (stable) rather than a parameter

Posterior distribution

Prior

Represents our perspective regarding the initial state of affairs

Based on

Prior belief

Prior research

Known approaches that work well in the modeling context
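
How a prior feeds into the posterior can be sketched with a grid approximation in base R. The example below assumes made-up data (7 successes in 20 trials) and compares a flat prior with a Beta(20, 20) prior standing in for prior research that points to values near 0.5. Both the data and the priors are illustrative assumptions.

```r
# Grid of candidate values for a success probability
theta <- seq(0, 1, length.out = 1001)

# Hypothetical data: 7 successes in 20 trials (made up for illustration)
likelihood <- dbinom(7, size = 20, prob = theta)

# Posterior on the grid: prior times likelihood, rescaled to sum to 1
grid_posterior <- function(prior) {
  unnorm <- prior * likelihood
  unnorm / sum(unnorm)
}

flat_prior     <- dbeta(theta, 1, 1)    # no prior information
informed_prior <- dbeta(theta, 20, 20)  # prior research suggesting values near 0.5

post_flat     <- grid_posterior(flat_prior)
post_informed <- grid_posterior(informed_prior)

# Posterior means: the informed prior pulls the estimate toward 0.5
mean_flat     <- sum(theta * post_flat)
mean_informed <- sum(theta * post_informed)
```

With the same data, the informed prior moves the posterior mean from about 0.36 to about 0.45, which is the prior acting as a regularizer.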

The prior is subjective?!?

Subjective != Arbitrary

For example…

Subjective

Choosing a normal distribution for the likelihood

Subjective

Setting a prior variance to some value

Arbitrary

Setting your null hypothesis parameter value to 0

Arbitrary

Choosing .05 as a cutoff for ‘significance’

Which of the following is R code for the Bayesian model?

lm(mpg ~ wt, data=mtcars)
stan_lm(mpg ~ wt, data=mtcars)   # rstanarm
brm(mpg ~ wt, data=mtcars)       # brms

Logistic Regression

glm(treat ~ educ + black + hisp + married, data=lalonde, family='binomial')
stan_glm(treat ~ educ + black + hisp + married, data=lalonde, family='binomial')   # rstanarm
brm(treat ~ educ + black + hisp + married, data=lalonde, family='bernoulli')       # brms

Ordinal Regression

clm(rating ~ temp*contact, data = wine)                       # ordinal
stan_polr(rating ~ temp*contact, data = wine, prior = R2(0.25))   # rstanarm; a prior is required, 0.25 is an illustrative choice
brm(rating ~ temp*contact, data = wine, family='cumulative')      # brms

Mixed Model

lmer(Reaction ~ Days + (1 + Days|Subject), data = sleepstudy)        # lme4
stan_lmer(Reaction ~ Days + (1 + Days|Subject), data = sleepstudy)   # rstanarm
brm(Reaction ~ Days + (1 + Days|Subject), data = sleepstudy)         # brms

The Point

RStan and associated packages such as rstanarm and brms make it easy to be a Bayesian

Things you’ll need to learn

Settings

Debugging

Diagnostics

Model comparison

Issues

Big data

Very complex models